Abstract: Amplified consensus genetic marker (ACGM) is a PCR-based marker technique that uses primers designed within conserved regions of coding sequences. After comparing Cryptomeria japonica ESTs with Arabidopsis coding sequences to search for conserved sequences, we obtained 237 single e-PCR products. We randomly selected 110 of these as candidate ACGM markers, of which 106 yielded stable and clear PCR products in C. japonica. We then tested the utility of these 106 primer pairs in 10 species representing 7 genera of Taxodiaceae. The number of primer pairs giving specific amplification in those 10 species varied from 49 to 103 (46.2%-97.2%). The 106 primer pairs (ACGM loci) were highly transferable to Cryptomeria fortunei Hooibrenk (97.2%) but much less transferable to Metasequoia glyptostroboides (46.2%). The number of PCR bands per primer pair ranged from 1.06 to 1.15, which means that most of the ACGM primers produced a single band in these 10 Taxodiaceae species. In summary, our study shows that ACGM is an applicable technique for marker development even in species with limited sequence data.
Introduction

Molecular markers have been widely applied in genetic map construction and gene mapping, marker-assisted selection during breeding, gene cloning, and comparative genomics. Many types of molecular markers have been developed since restriction fragment length polymorphisms (RFLPs) were introduced more than three decades ago (Botstein et al. 1980). To date, the more powerful and readily available markers are those based on the polymerase chain reaction (PCR). Broadly, there are two types of PCR-based markers. The first type comprises random primer markers, which can be used in most species; examples are randomly amplified polymorphic DNAs (RAPDs) (Williams et al. 1990) and amplified fragment length polymorphisms (AFLPs) (Vos et al. 1995). The second type comprises locus-specific primer markers such as simple sequence repeats (SSRs), which must be developed from and used in the target species (Becker et al. 1995). The flexibility of random primer markers is unquestioned, since they can be used in nearly all species; however, their reliability, especially that of RAPDs, is in some doubt. Although the utility of SSRs in genetic studies is well established, isolating and characterizing such markers by traditional methods is costly and time consuming, making the de novo development of SSRs unrealistic for some taxa (Pashley et al. 2006).

Amplified consensus genetic marker (ACGM) is a PCR-based marker technique that focuses on primer design within conserved regions of coding sequences (Fourmann et al. 2002). By comparing regions of homology between closely related species, it is possible to develop ACGM markers for species that have sufficient sequence data (Lu et al. 2006a). Expressed sequence tags (ESTs) are generated by single-pass sequencing of randomly picked cDNA clones (Adams et al. 1991). For species without an available whole-genome sequence, ESTs offer the best route to consensus region analysis of expressed genes among closely related species. Thus, using EST databases to develop ACGMs for target species is an inexpensive and rapid alternative to traditional methods for developing locus-specific molecular markers (Wang et al. 2005).

At present, however, many species still have no EST data. Yet it is often research on these species, with little available sequence data, that would benefit most from more efficient molecular marker development. According to Yang et al. (2007), EST-derived markers are likely to be conserved across a broader taxonomic range than any other type of marker. Therefore, for species with scarce sequence information, EST data from closely related species could be used. For example, Lu et al. (2006b) developed ACGM markers in Gramineae using rice databases from the cultivars Nipponbare and 93-11. These markers were transferable among several grass species.

The Taxodiaceae, a family of gymnosperms, are relict plants from the Cretaceous period (Wu et al. 1998). The family was an important component of forest vegetation in the northern hemisphere from the late Cretaceous to the mid-Tertiary, approximately 115 to 30 million years ago. In the late Tertiary and Pleistocene, however, the family underwent a widespread reduction, resulting in the present-day relictual genera with restricted distributions (Schlarbaum and Tsuchiya 1984).

At present, research on the Taxodiaceae lags behind that on other tree species. Furthermore, only limited sequence data are available for Taxodiaceae species. This lack of sequence data makes it difficult to develop locus-specific primers, so the choice of markers for genetic and genomic research in these species is narrow. It is therefore both useful and necessary to develop and test additional markers for Taxodiaceae species to support further research. Cryptomeria japonica has the most available sequence data of all Taxodiaceae species. By comparing C. japonica ESTs with Arabidopsis whole-genome coding sequences to search for conserved regions of homology, we exploited the ACGM technique for marker development in the Taxodiaceae. This paper provides a case study of the utility of available C. japonica EST resources for developing the markers needed for genetic analysis of members of the Taxodiaceae family.

Materials and methods 

Search for putative ACGMs 

In total, 56,646 EST sequences of C. japonica, released by the Plant Genomics Database (GDB, http://www.plantgdb.org), were downloaded. In addition, the coding DNA sequence (CDS) data of Arabidopsis (Arabidopsis thaliana ecotype Columbia) were downloaded from The Arabidopsis Information Resource (http://www.arabidopsis.org/).

We developed a pipeline of Perl scripts to search for conserved regions of homology between C. japonica and Arabidopsis. The initial step was to identify the conserved regions within the available C. japonica ESTs by aligning the EST sequences of C. japonica with the CDS of Arabidopsis using BLASTN.30. A C. japonica EST was considered homologous to an Arabidopsis CDS only if the two overlapped by at least 200 bp with at least 80% similarity. We then identified possible variable loci within the C. japonica ESTs by aligning the EST sequences with the CDS of Arabidopsis using BLAT. To capture additional putative polymorphisms, only EST sequences containing at least one variable region were retained as putative ACGMs. The third step was to design primers for the C. japonica ESTs containing candidate conserved regions. For each EST, a pair of primers was designed with the program ePrimer3 (Rozen and Skaletsky 2000) on a 200-bp sequence cut from the C. japonica EST, with 100 bp on each side of the target variable locus.
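The homology criterion and the 200-bp primer-design window described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Perl pipeline used in the study: it assumes the alignments are available in the standard 12-column BLAST tabular format, and the function names, sequence identifiers, and sample rows are hypothetical.

```python
import csv
import io

MIN_OVERLAP_BP = 200      # minimum overlap stated in the text
MIN_IDENTITY_PCT = 80.0   # minimum similarity stated in the text


def homologous_pairs(blast_tabular):
    """Return (est_id, cds_id) pairs that pass both thresholds.

    Assumes tab-separated rows starting with:
    query id, subject id, percent identity, alignment length.
    """
    pairs = []
    for row in csv.reader(io.StringIO(blast_tabular), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        identity, length = float(row[2]), int(row[3])
        if length >= MIN_OVERLAP_BP and identity >= MIN_IDENTITY_PCT:
            pairs.append((row[0], row[1]))
    return pairs


def primer_window(est_seq, locus_index, flank=100):
    """Cut the region around a variable locus (0-based index), 100 bp per side."""
    return est_seq[max(0, locus_index - flank):locus_index + flank]


# Hypothetical alignment rows, for illustration only.
EXAMPLE = "\n".join([
    "\t".join(["EST0001", "AT1G01010.1", "85.20", "240", "35", "0",
               "1", "240", "101", "340", "1e-60", "230"]),
    "\t".join(["EST0002", "AT2G02020.1", "91.00", "150", "13", "0",
               "1", "150", "11", "160", "1e-40", "160"]),   # too short
    "\t".join(["EST0003", "AT3G03030.1", "76.50", "400", "94", "0",
               "1", "400", "21", "420", "1e-50", "200"]),   # identity too low
])

if __name__ == "__main__":
    print(homologous_pairs(EXAMPLE))
    print(len(primer_window("A" * 500, 250)))
```

Only the first hypothetical row passes both thresholds; the second fails on overlap length and the third on similarity.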

Targeting of candidate ACGMs by electronic PCR 

The designed primers were tested by electronic PCR (e-PCR) on the EST sequences of C. japonica. To increase the quality and usability of the in silico predicted ACGM markers, we required exact matches between primers and templates and set a 300-bp margin for the product size in the e-PCR. We accepted a putative ACGM locus as a candidate ACGM marker only if its primers successfully and uniquely amplified the correct target in the e-PCR. The candidate primers were named with the abbreviation TA (for Taxodiaceae ACGM) followed by a unique number (e.g. TA11).
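The acceptance rule for e-PCR can be illustrated with a toy exact-match search. The sketch below is not the e-PCR program itself: it assumes that the 300-bp margin acts as an upper limit on product length (one possible reading of the text), and the function names and toy sequences are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def epcr_hits(forward, reverse, templates, max_product=300):
    """List (template_id, product_length) for every exact-match amplicon.

    Assumption: `max_product` stands in for the 300-bp product-size margin.
    """
    hits = []
    rev_site = reverse_complement(reverse)
    for name, seq in templates.items():
        start = seq.find(forward)
        while start != -1:
            end = seq.find(rev_site, start + len(forward))
            while end != -1:
                product = end + len(rev_site) - start
                if product <= max_product:
                    hits.append((name, product))
                end = seq.find(rev_site, end + 1)
            start = seq.find(forward, start + 1)
    return hits


def is_candidate(forward, reverse, templates, target_id):
    """Accept a locus only if the primers amplify the target, and only it."""
    hits = epcr_hits(forward, reverse, templates)
    return len(hits) == 1 and hits[0][0] == target_id


# Toy templates, for illustration only.
TEMPLATES = {
    "est1": "TT" + "ACGTAC" + "GGGGGGGG" + "ATCCAA" + "CC",
    "est2": "CCCCCCCCCCCC",
}

if __name__ == "__main__":
    print(epcr_hits("ACGTAC", "TTGGAT", TEMPLATES))
    print(is_candidate("ACGTAC", "TTGGAT", TEMPLATES, "est1"))
```

Requiring exactly one hit, on the intended template, mirrors the "successfully and uniquely amplified" condition in the text.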

Experimental verification and evaluation of ACGM markers in C. 
japonica by PCR 

Fresh leaves of C. japonica were collected from Hangzhou Plant 
Garden (Zhejiang, China). Total genomic DNA was isolated 
from 200 mg of fresh ground leaf tissue using the CTAB method 
according to Murray et al. (1980). 

All primers were synthesized by the Nanjing Jinsite Biological Engineering & Technology Company (Nanjing, China). PCR was performed in 20 µL reactions containing 50 ng of template DNA, 0.5 µmol/L of each primer, 200 µmol/L of each dNTP, 1.5 mmol/L of MgCl2, 1 unit of Taq polymerase, and 2 µL of 10× PCR reaction buffer. A touchdown PCR program (Don et al. 1991) was used: 5 min at 95°C; 10 cycles of 30 s at 95°C, 30 s at 58°C minus 0.3°C per cycle, and 1 min at 72°C; 20 cycles of 30 s at 95°C, 30 s at 55°C, and 1 min at 72°C; and a final extension of 7 min at 72°C. For primer pairs that did not amplify well, the initial annealing temperature was adjusted between 55°C and 60°C. Each primer pair was tested twice to confirm the repeatability of the bands observed in C. japonica. PCR products were separated on 1% agarose gels, which were stained with ethidium bromide to visualize the DNA bands.
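The annealing temperatures implied by the touchdown program can be listed cycle by cycle. The short Python sketch below assumes that the first touchdown cycle anneals at 58°C and that the 0.3°C decrement applies from the second cycle onward; the function name and parameters are hypothetical.

```python
def touchdown_annealing_temps(start=58.0, step=0.3, touchdown_cycles=10,
                              plateau=55.0, plateau_cycles=20):
    """Annealing temperature (°C) for each cycle of the touchdown program."""
    # Touchdown phase: lower the annealing temperature by `step` each cycle.
    temps = [round(start - step * i, 1) for i in range(touchdown_cycles)]
    # Plateau phase: constant annealing temperature.
    temps += [plateau] * plateau_cycles
    return temps


if __name__ == "__main__":
    schedule = touchdown_annealing_temps()
    for cycle, temp in enumerate(schedule, start=1):
        print(f"cycle {cycle:2d}: anneal at {temp:.1f} °C")
```

Under these assumptions the last touchdown cycle anneals at 55.3°C, just above the 55°C used for the remaining 20 cycles.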

Experimental evaluation of ACGM marker development in other 
Taxodiaceae species 

To evaluate the transferability of ACGMs in Taxodiaceae, plant material from 10 species (Table 1), representing seven genera of Taxodiaceae, was collected from Hangzhou Plant Garden (Zhejiang, China). Genomic DNA was extracted using the CTAB method, and PCR was performed, as described above. Only clear and reproducible bands were counted as specific amplification. Based on the PCR data, we evaluated the allelic diversity of each ACGM marker using the polymorphism information content (PIC), defined as $\mathrm{PIC} = 1 - \sum_i P_i^2$, where $P_i$ is the frequency of the $i$th allele of the marker.
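As a worked illustration of the PIC calculation, the sketch below assumes the usual form PIC = 1 - sum of squared frequencies and takes raw counts per allele (or band pattern) across accessions. The function names and the example counts are hypothetical and are not data from this study.

```python
def pic(frequencies):
    """Polymorphism information content: 1 minus the sum of squared frequencies."""
    return 1.0 - sum(p * p for p in frequencies)


def pic_from_counts(counts):
    """Compute PIC from raw counts of each allele (or band pattern)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one observation is required")
    return pic(count / total for count in counts)


if __name__ == "__main__":
    # A monomorphic marker has PIC 0; two equally frequent alleles give 0.5.
    print(pic_from_counts([10]))
    print(pic_from_counts([5, 5]))
    print(round(pic_from_counts([7, 3]), 2))  # hypothetical 7:3 split
```

With only two allelic states the value cannot exceed 0.5, which matches the upper end of the range reported in the Results.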
Table 1. Accessions used for ACGM analysis

| Accession number | Species | Genus | Chromosome number* |
|---|---|---|---|
| 1 | Taxodium distichum | Taxodium | 2n=22 |
| 2 | Taxodium ascendens Brongn | Taxodium | 2n=22 |
| 3 | Sequoia sempervirens | Sequoia | 2n=66 |
| 4 | Variety of Sequoia sempervirens | Sequoia | unknown |
| 5 | Cryptomeria fortunei | Cryptomeria | 2n=22 |
| 6 | Variety of Cryptomeria fortunei Hooibrenk | Cryptomeria | unknown |
| 7 | Cryptomeria japonica | Cryptomeria | 2n=22 |
| 8 | Metasequoia glyptostroboides | Metasequoia | 2n=22 |
| 9 | Cunninghamia lanceolata | Cunninghamia | 2n=22 |
| 10 | Taiwania cryptomerioides | Taiwania | 2n=22 |
| 11 | Glyptostrobus pensilis | Glyptostrobus | 2n=22 |

All plant materials were collected from Zhejiang plant garden in China. *Data from Schlarbaum and Tsuchiya (1984).

Results 

Candidate ACGM markers 

After screening 56,646 EST sequences from C. japonica with the Perl pipeline, we obtained 10,289 ESTs carrying putative ACGM loci. e-PCR yielded products for 1,705 (17%) of these putative loci. The failure of the remaining sequences to yield a product might be due to the stringent e-PCR conditions. Among the successful e-PCR products, multiple ACGM loci gave the same PCR product. Multiple-gene-copy markers are not desirable for genetic studies, so we discarded these putative ACGMs and their primer pairs. After filtering out these redundant ACGM loci, we obtained 237 single e-PCR products, 110 of which were randomly selected as candidate ACGM markers.
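The redundancy filter and the random selection can be sketched as follows. This is a hypothetical illustration rather than the authors' code: it assumes that every locus sharing an e-PCR product with another locus is discarded, and the function names, locus identifiers, and random seed are invented for the example.

```python
import random
from collections import defaultdict


def single_copy_loci(epcr_products):
    """Keep loci whose e-PCR product is not shared with any other locus.

    `epcr_products` maps a locus id to an identifier (or sequence) of its product.
    """
    by_product = defaultdict(list)
    for locus, product in epcr_products.items():
        by_product[product].append(locus)
    return sorted(loci[0] for loci in by_product.values() if len(loci) == 1)


def pick_candidates(loci, k, seed=0):
    """Randomly choose k loci for wet-lab testing (seeded for repeatability)."""
    return random.Random(seed).sample(loci, k)


if __name__ == "__main__":
    products = {"locusA": "P1", "locusB": "P1", "locusC": "P2", "locusD": "P3"}
    unique = single_copy_loci(products)
    print(unique)
    print(pick_candidates(unique, 1))
```

In the study, the corresponding step reduced the e-PCR products to 237 single products, from which 110 were drawn at random.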

Experimental tests of candidate ACGM markers 

Following the in silico analysis, the 110 randomly selected candidate ACGM markers were tested experimentally in C. japonica by PCR (Fig. 1). Of the 110 candidates, 106 yielded a single, clear PCR product in C. japonica. In theory, since all the primer pairs were designed from C. japonica ESTs, all of them should amplify products from the C. japonica genome. We suspect that the four remaining candidate loci failed to amplify because of sequence variation at the primer sites in the C. japonica accession used in this study, or because of sequencing errors in the EST data.

Transferability of ACGM primers within the Taxodiaceae genera 

The utility of the 106 PCR-confirmed ACGM markers was tested in various Taxodiaceae species (Fig. 2). The number of primer pairs giving specific amplification in the 10 species varied from 49 to 103 (46.2%-97.2%, Table 2). The 106 primer pairs (ACGM loci) were highly transferable to Cryptomeria fortunei Hooibrenk (97.2%) but less transferable to Metasequoia glyptostroboides (46.2%). The number of PCR bands per primer pair varied from 1.06 to 1.15, which means that most of the ACGM primers produced a single band in the 10 Taxodiaceae species. The PIC values of the markers varied from 0 to 0.50, with an average of 0.27. The polymorphism level of the ACGM markers was therefore generally not high. However, a higher estimate of polymorphism could probably be obtained with higher-resolution methods of DNA fragment analysis (e.g. denaturing PAGE, as usually used for SSR analysis, or DNA sequencing). The primer sequences and putative functions of the 106 new ACGM markers are shown in Table 3.
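The transferability percentage and the bands-per-primer-pair figure follow directly from the counts in Table 2. The sketch below recomputes them for three accessions; the function name is hypothetical, and the rounding (one decimal for percentages, two for bands per pair) is an assumption about how the table was prepared.

```python
TOTAL_PRIMER_PAIRS = 106  # primer pairs confirmed in C. japonica


def transferability(amplified_pairs, total_bands, total=TOTAL_PRIMER_PAIRS):
    """Return (percent of primer pairs amplifying, bands per amplifying pair)."""
    percent = round(100.0 * amplified_pairs / total, 1)
    bands_per_pair = round(total_bands / amplified_pairs, 2)
    return percent, bands_per_pair


if __name__ == "__main__":
    # (amplifying primer pairs, PCR bands) taken from Table 2
    table2_counts = {3: (53, 56), 5: (103, 115), 8: (49, 56)}
    for accession, (pairs, bands) in table2_counts.items():
        print(accession, transferability(pairs, bands))
```

For example, accession 5 gives 103/106 = 97.2% and 115/103 = 1.12 bands per primer pair.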


Fig. 1 Primers screened in C. japonica, separated by electrophoresis on a 1% agarose gel (M, DNA Marker DL2000). Lanes 1 to 11 represent a variety of C. fortunei Hooibrenk, a variety of S. sempervirens, G. pensilis, T. cryptomerioides Hayata, C. japonica, C. fortunei Hooibrenk, C. lanceolata, M. glyptostroboides, T. ascendens Brongn, S. sempervirens, and T. distichum, respectively.


Fig. 2 Primers screened in Taxodiaceae, separated by electrophoresis on a 1% agarose gel (M, DNA Marker DL2000). Lanes 1 to 11 are the same as in Fig. 1.


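Table 2. PCR results of 106 pairs of ACGM primers in species of Taxodiaceae

| Material No. | Specific amplification primer pairs (number) | Specific amplification primer pairs (%) | Bands of PCR product (number) | Bands per primer pair |
|---|---|---|---|---|
| 1 | 73 | 68.9 | 80 | 1.1 |
| 2 | 83 | 78.3 | 91 | 1.1 |
| 3 | 53 | 50 | 56 | 1.06 |
| 4 | 62 | 58.5 | 67 | 1.08 |
| 5 | 103 | 97.2 | 115 | 1.12 |
| 6 | 98 | 92.5 | 111 | 1.13 |
| 8 | 49 | 46.2 | 56 | 1.14 |
| 9 | 67 | 63.2 | 77 | 1.15 |
| 10 | 74 | 69.8 | 84 | 1.14 |
| 11 | 84 | 79.2 | 97 | 1.15 |
| Mean | 68 | 64 | - | 1.02 |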
Table 3. Sequences and related information of 106 ACGM primers. The function was presumed from conserved regions of homology between C. japonica and Arabidopsis.

| Marker | Forward primer (5'-3') | Reverse primer (5'-3') | Gene function or putative function |
|---|---|---|---|
| TA1 | ACGCCAAGAGAGTCACCATC | GCTTAAACAAAAAGCAATCATTAGG | Histone superfamily protein |
| TA2 | CGAACCTGTTGTTTTTGATACA | CACCAGTCACAGCAATCAGC | Chlorophyll A-B binding family protein |
| TA3 | CCAAATGGTGGGTGTTGAA | CCTGCCACGTTTTGATTCTT | Pectin lyase-like superfamily protein |
| TA4 | GAAATGCGATGAAAGGGAAA | TGTGATGCAGGTCCATTATGA | MYB97, AtMYB97 |
| TA5 | GAGACCTCTGAGGCAGATCG | TGGAATTGGATGCTTCCTTC | Glycolipid transfer protein (GLTP) family |
| TA6 | TGTTGGACATCTGGACGAAA | TCCATGACACAAAAGTCACCA | Ribosomal protein S21e |
| TA7 | GGTGTAGTCAACTGGCGTTTG | TGATTTTAGTAGAAAGATGTGGGTGT | transposable element gene |
| TA8 | CAACACCAATGGGAGCAACT | CTGCCCTTTTAACTGTCATGC | unknown protein |
| TA9 | TGACCAAGGTTGACAGACGA | CACCACTTGGCTCCTTCTTC | GTP binding Elongation factor Tu family protein |
| TA10 | CGAGTCTCTATCCCGTCGAG | GGGTGTTTGACACGAACCTT | Glyptostrobus pensilis internal transcribed spacer 1 |
| TA11 | CGGAAGAAGAGGCGTAAGAA | AATGGCAAAAAGCCACAATC | transposable element gene |
| TA12 | TGTTGGGGCTTTGCTAGTTT | TCCTGTGAATCTCACCCAGA | ATRAB11C, ATRABA2A, ATRAB-A2A, RAB-A2A, RAB11c |
| TA13 | GCCTGTAGAAGCAGGCTTGA | AATATCCACAGCCACATCTGC | RAB18, ATDI8; Dehydrin family protein |
| TA14 | CGGGGGCGGTTTATATACTT | TCATGCATAGAATTTGGGAGAA | cytochrome c biogenesis protein family |
| TA15 | GGCGTTCAAGCTAGAAGATGA | TTTTTCGTAATTTCTTTTACGATGAT | transposable element gene |
| TA16 | TGGTAAAAGCAAAAGACGAAA | GCCCTTAATGGCCTGATGTA | SYTD, ATSYTD, NTMC2TYPE2.2, NTMC2T2.2, SYT4 |
| TA17 | ATCGGCGCCTTATGGTAACT | GGCTTCCGCATTATTTTCAA | Homeodomain-like superfamily protein |
| TA18 | CCTTCGGCTTGAAAAACAAG | CTTTTCACCCTTCCGTTGAA | Zinc finger C-x8-C-x5-C-x3-H type family protein |
| TA19 | AACCCTCACTGGAAAGACCA | ATCTTGGCCTTGACGTTGTC | polyubiquitin 10 |
| TA20 | GGGGAAAGAAGACCCTGTTG | CACGAGATTTCTGTTCTCGTTG | Glyptostrobus pensilis 26S ribosomal RNA gene |
| TA21 | TTCACATTGGAGTCGGTCAA | AGGGCTCTGCTGTTCAATGT | S-adenosylmethionine synthetase 1 |
| TA22 | GCAATCTAAGCGTCATGCTG | TTTTTCACCTTTCCCTCACG | chloroplast-encoded 23S ribosomal RNA |
| TA23 | TTACGGTTTCGAGGGTCATC | CCTCATTGTCCCAACCAGAC | NADH dehydrogenase subunit 9 |
| TA24 | GATTACGGTTTCGAGGGTCA | TTTGAAGGAGCCAGTTGGAG | NAD9 |
| TA25 | GGGCGACTGTTTACCAAAAA | CTTTCTGTCCAGGTGCAGGT | chloroplast-encoded 23S ribosomal RNA |
| TA26 | GGGAGAGCAATCTAAGCGTCA | TTTTTAGCTATCGGTCACTCAGG | chloroplast-encoded 23S ribosomal RNA |
| TA27 | GCAATCTAAGCGTCATGCTG | TGCTCACGGTACTATTTCGCTAT | chloroplast-encoded 23S ribosomal RNA |
| TA28 | ATGGTCACATCTTCGGCTCT | CGGACCACCTCGTAATCATC | Protein kinase superfamily protein |
| TA29 | TCCATCGCAATCCTTTTCAT | CGGTACCTGAGCCATTTCAT | 2-oxoglutarate and Fe(II)-dependent oxygenase superfamily protein |
| TA30 | CTCAGATGTGTGGGCAAAGA | CGGATGAGTTCCGAATGTTT | CCCH-type zinc finger protein with ARM repeat domain |
| TA31 | CCTAGAGGCACTGGGTTCAA | GGCAGTGAGGGTCTCGATAA | S-adenosyl-L-homocysteine hydrolase |
| TA32 | ATTTTCAGGGCCAGATTCCT | CCATTCTGCCGTTACAATGAA | transposable element gene |
| TA33 | ATGCAGTATGGGGATTGCAG | GAGCCTCGACCCCTTAAGAT | NAC domain containing protein 47 |
| TA34 | GCAGCGTGCAGTGTGTATG | GAAAAATCAATCCAAGCCACT | alpha-1 tubulin |
| TA35 | GGAGGAGCTAGACGCAGAGA | GTCCAGAAGGCCGTATGTGT | Ribosomal protein S4 |
| TA36 | AAACTGCGGATGGCTCATTA | CGAACCCTAATTCTCCGTCA | rRNA |
| TA37 | GTATGGCCGGCTAATTCGAG | AGCCAATTAAGGCCAGGAAC | Symbols: rRNA |
| TA38 | TTATCAGGTCGCAGCGGTA | AGCCAATTAAGGCCAGGAAC | Symbols: rRNA |
| TA39 | AATCCCTTAACGAGGATCCATT | AGCCAATTAAGGCCAGGAAC | Symbols: rRNA |
| TA40 | GAGCCCAAAATCAGACAAGC | TCACCCTCCTTTGCTTCATC | HEAT SHOCK PROTEIN 89.1 |
| TA41 | GGACGCCAAGAGAAGAAAGA | ACACACCACGCATACAATCC | cold-regulated 47 |
| TA42 | ACTTCACGGAGCACAAAGG | GCAATTGTAACAGCGCCTAA | Histone superfamily protein |
| TA43 | TGGGTACATTCTCGAGGGTAA | GAGCATCCATTCATCTCTCAAA | Ribosomal protein S8e family protein |
| TA44 | AAGGCACATTCTGAGCGAGT | TGTGGCAATTTATTGCATCATT | Chalcone and stilbene synthase family protein |
| TA45 | GGTTGCATCAGCTTGAGATGT | TGCGAAGATGCATGTTGTTT | transposable element gene |
| TA46 | GGCTGGAATTGTGAAGAAGG | GGGAAACAGCGCAATGAT | Translation initiation factor SUI1 family protein |
| TA47 | AAGATTGTCAGTGGCGTGTG | GTCTACGCCAGCTCTCCTTG | Ribosomal protein L32e |
| TA48 | GACAAGGGAGGTTTGATGGA | AAACAACATCCACAGCCACA | unknown protein |
| TA49 | GGCCTAGGTGTAACATATTGGA | AATGAGCAGCCATGGGAGT | S-adenosyl-L-methionine-dependent methyltransferases superfamily protein |
| TA50 | TTCAACGTGCTTGCTTGTTC | AATCTTCAAGCTTGGCCGTA | proteasome alpha subunit D2 |
| TA51 | TGAAAACAGCTCAATTTGCAC | CACTAAGCTTGTCGCCACTG | pseudogene, similar to putative carboxyl-terminal, proteinase, |
| TA52 | TGAACCGAAAGCCGAGTTAC | TGCTACTACCACCAAGATCTGC | Glyptostrobus pensilis 26S ribosomal RNA gene |
Table 3 (continued)

| Marker | Forward primer (5'-3') | Reverse primer (5'-3') | Gene function or putative function |
|---|---|---|---|
| TA53 | CGTCACGTAAGTGGGTTGTG | AGTTGTTCTGAACCGGCATT | Pinus strobus homeobox transcription factor KN2 (Kn2) gene |
| TA54 | GGATGCTTACCGAAAGAAGG | TCAATGAATTGGTTGCATATGAT | Ribosomal protein S7p/S5e family protein |
| TA55 | GCAATCTAAGCGTCATGCTG | TTGCCCACGGTACTATTTCG | chloroplast-encoded 23S ribosomal RNA |
| TA56 | GTTGTGAAGGGGCAAAAATG | GGCTTCCTGGCAGTACAAAC | cell elongation protein / DWARF 1 / DIMINUTO (DIM) |
| TA57 | TAGAGTGCAAGAAGGCGTTG | ATTCCTTTTCACCCCCAATC | Pectin lyase-like superfamily protein |
| TA58 | AGGGAACGGGCCCTTATATC | TTTTCGTTCCTAGTCGTCATGG | unknown protein |
| TA59 | TGTTGCAGGGTTTGTGTAGG | TGGATTCCCACAGTTGAACA | plasma membrane intrinsic protein 1;4 |
| TA60 | ATCTGGGTGGAGGGACTTTT | GAGTCGATCTCGATGGTGGT | heat shock cognate protein 70 |
| TA61 | AGGACGCTTGTTTCTCTGAATC | AGGGTGGACTCCTTTTGGAT | ubiquitin 13 |
| TA62 | GGGTTTTGTTATACGAGCCTGT | TGGACTCCTTCTGGATGTTG | Symbols: UBQ13 |
| TA63 | CAAACCAAAACCAGGCTCTC | CGGAAGAGGAGAATCAGTGG | Oxidoreductase, zinc-binding dehydrogenase family protein |
| TA64 | CGAAGGATTTGGGGTTTGTA | GGTCATAAAGCGGACGAGAA | KNOTTED 1-like homeobox gene 4 |
| TA65 | TTGAGGCTGGGTTTTACAGG | CAGGCTGGTGCATCCTAATA | unknown protein |
| TA66 | CTCACTTGTCTGTATTGGAACGA | CGGCGTATCATCCAACAGTA | Peroxidase superfamily protein |
| TA67 | TGTCATAGGAGGTGAAAATTTGAT | CAACCGTTTTACGATAACAGCA | Translation initiation factor SUI1 family protein |
| TA68 | AGGAGTCCACCCTTCACTTG | CGTCCATCCTCAAGCTGTTT | Ubiquitin family protein |
| TA69 | ATGGCAGAAAAGGCAATACG | CGCCTCAACTTCGACTGAA | Symbols: ATEXPB1, EXPB1, ATHEXP BETA 1.5 |
| TA70 | CTGGTGAGGCACGTTCTGT | TCTCCTCATTCAAAGGATCG | O-Glycosyl hydrolases family 17 protein |
| TA71 | GAGCTTGGCAAGGTTATTCAA | CTGCTGCTCACACTGTCCAT | Translation initiation factor SUI1 family protein |
| TA72 | GCGTACCATTTTGATTGGAGA | CAAAACAGATTGGCTGCCTTA | unknown protein |
| TA73 | CTTTAGCACTCAGCGATCATTT | CAGCGGTTGACTGGATCTTC | chloroplast-encoded 23S ribosomal RNA |
| TA74 | GAGCTCTGACACCATCGACA | CCGTTACGGCTCAAACATAAA | polyubiquitin 10 |
| TA75 | GAGAAGAAGGCAATGGTGGA | CAAGGCCTGTCCCTGAAATA | Remorin family protein |
| TA76 | AGCAGTTCGGATAGTGACACC | TGCTGTGGAATCCAAATTGA | Leucine-rich repeat transmembrane protein kinase |
| TA77 | GCAAGACCATTACCCTGGAA | GGCCTACGAACATCGAACAT | ubiquitin 4 |
| TA78 | ACGAACTCTTCGAGGCAAAA | TGCAGAGTCTCGGTTCACAG | unknown protein |
| TA79 | GCTTAGTGTGCGACTCGTTG | TGCTCTTTGGGGGTGATTTA | pseudogene, hypothetical protein |
| TA80 | TTGCTGCCGTTTTGGTAAAT | CCCAATTCTCTTCAACTTCACA | Transducin/WD40 repeat-like superfamily protein |
| TA81 | GGATAGAGGAGGAGGAAGACG | CACTGTAATGTCCCCGATGG | unknown protein |
| TA82 | CCGCAGCTCAATCATTGTAA | CCTCTGCAGAACAAAAGTTGC | CAM8, AtCML8 |
| TA83 | AGGCATACAGGCAGCAGAAT | GGTATCGTGTGTGGATCATTTG | WUS, PGA6, WUS1, Homeodomain-like superfamily protein |
| TA84 | TAGCGGGACGTTCTGTGAAT | GGCAACTTCGTCACTTTCGT | mitochondrial 26S ribosomal RNA protein |
| TA85 | TGATTAAACAAGCGCAAATCA | GTGAGGCGGCATGAGAATA | unknown protein |
| TA86 | CGAAAGGGAATCGGGTTAAT | GACCAGAGGCTGTTCACCTT | unknown protein |
| TA87 | AATCCCTTAACGAGGATCCATT | AGCCAATTAAGGCCAGGAAC | rRNA |
| TA88 | CTTCTGCCCCCAAGACCTA | TCCGGTTTACCTGTCCAATC | vacuolar protein sorting 41 |
| TA89 | GCATTGAAATTTTCTGCTTTCTG | ATGAGCCTCTTGGCATCAAA | Heat shock protein 70 (Hsp 70) family protein |
| TA90 | TTGGTGCACATTTTCGTGTT | AGAAGTTGGTGGCTGTGACC | ATEXPA8, EXP 8, ATEXP8, ATHEXP ALPHA 1.11, EXPA8 |
| TA91 | ATTGCAGTTGCAGCCTTTCT | ATTCCCACGGTTGAACACAT | plasma membrane intrinsic protein 1B |
| TA92 | CTGTAAGCAAAGCCCCTCTG | ACCAGAAGGAGGAGGAGGAG | pseudogene, glycine-rich protein |
| TA93 | GCTGCAAACGAATGAATGTC | TATTTGCCCTGCGAAGAGAG | unknown protein |
| TA94 | CAGCACCAAGAGACAACCTG | TGGCAGAGCAAATCATGTTC | Translation protein SH3-like family protein |
| TA95 | TAGAGAAAGGGACCGTGACC | CATAGGCCAAAAACTAGTGTGC | unknown protein |
| TA96 | AGGAAAGGCCGAGGTTTTAC | GAGTCTCGGCTGCAACCTTA | pseudogene, similar to fiber protein Fbl2, |
| TA97 | TCAGCGAAAGAGAAGAAGGAA | CATGGCGACCTGGTTCTTAG | Heat shock protein 70 (Hsp 70) family protein |
| TA98 | GCGGAGGCTTTGATTTACAG | AGTTTTGCCCGTTAGGGTTT | UBQ10 |
| TA99 | AATTCTTCGCGGTCTCAGAA | TGTGCAATTTCTCGAACCAA | Histone superfamily protein |
| TA100 | GTGTGTTGGGTTGTTTGTGC | TCCAGAGAGCAATATCGATGG | GTP binding Elongation factor Tu family protein |
| TA101 | CAGCGATTTTGAAGTGTGTGA | TTTGCATGAATGAAGTCTTTGG | transposable element gene |
| TA102 | TGAAGGGGAATATGACGATGA | CAATGTAACGTAGCGGAGAGA | tubulin beta-7 chain |
| TA103 | ATCCTCCTCTGGGACGTTTT | CACTATGATGCCACCGTCTG | GTP binding Elongation factor Tu family protein |
| TA104 | CACTGACAAGCTTGGTTTGC | AGCCAGACCCACTACAAAATG | ADP-ribosylation factor A1E |
| TA105 | TGTCAAGATTGATTGCGTCTG | GTTTGTTGTTGCGGTTGTTG | RING-H2 group F1A |
| TA106 | GTGTGGAAGGTTGGAATGGT | TGCTCGGTGCACAATCATTA | phenylalanine ammonia-lyase 2 |
Discussion

Molecular markers are useful tools for genetics and breeding research. In recent years, high-density oligonucleotide microarrays and next-generation sequencing technologies have considerably increased the amount of available genome sequence data (Brady et al. 2009). However, for many species, especially woody plants, publicly available sequence information remains limited. At present, it is not yet feasible to sequence the genome of a particular species under study without multi-laboratory or multi-national efforts. However, as we verified in this paper, EST data can be combined with available whole-genome sequences to develop genetic tools for taxonomic groups closely related to the species from which the ESTs derive. The primers for our putative ACGM loci were verified in several species within the Taxodiaceae. These results suggest that similar strategies can be followed to develop primers in a variety of species for which sequence data are scarce, making ACGM marker development a reasonable choice for these species.

In the ACGM procedure, primers are designed within conserved homologous regions of coding sequences. Therefore, to exploit ACGM to the fullest, the reference sequence, in our case Arabidopsis, should be as closely related as possible to the experimental species that the EST data represent. Logically, it would have been better to use sequence data from a woody plant such as poplar rather than Arabidopsis. The draft genome sequence of poplar has been completed and is available (GDB). However, when we blasted the EST sequences of C. japonica against the poplar data, two issues arose. First, the outcome was no more useful than that for Arabidopsis alone (i.e. no major increase in homologous regions was seen; data not shown). Second, the annotation of poplar coding sequences is not as developed as that of Arabidopsis, which would make further aspects of the study more ambiguous. In view of these issues, we recommend using Arabidopsis as the reference sequence when searching for consensus regions during ACGM marker development in woody species such as those of the Taxodiaceae.

Using ACGM for marker development is, however, not without its disadvantages. ACGMs are genetic markers residing in gene sequences, and they can directly reflect variation within those genes. Maps constructed with ACGM markers could therefore be especially valuable for genetic studies, but levels of polymorphism could be low because ACGMs lie in expressed regions, which are more evolutionarily conserved than untranscribed sequences. Nevertheless, ACGM primers can be useful for comparative genomic studies precisely because they are designed in expressed regions. The technique is applicable to a wide range of species and shows a linear relationship among different genomes within a genus (Lu et al. 2006b).